Abstract

The "Gibbs Paradox" refers to several related questions concerning 
entropy in thermodynamics and statistical mechanics: whether it is an 
extensive quantity or not, how it changes when identical particles are 
mixed, and the proper way to count states in systems of identical parti- 
cles. Several authors have recognized that the paradox is resolved once 
it is realized that there is no such thing as the entropy of a system, that 
there are many entropies, and that the choice between treating particles 
as being distinguishable or not depends on the resolution of the experi- 
ment. The purpose of this note is essentially pedagogical; we add to their 
analysis by examining the paradox from the point of view of information 
theory. Our argument is based on the 'grouping' property of entropy that
Shannon recognized, by including it among his axioms, as an essential
requirement on any measure of information. Not only does it provide the
right connection between different entropies but, in addition, it draws our
attention to the obvious fact that addressing issues of distinguishability
and of counting states requires a clear idea of precisely what we
mean by a state.

1 Introduction 

Under the generic title of "Gibbs Paradox" one usually considers a number of 
related questions in both phenomenological thermodynamics and in statistical 
mechanics: (1) The entropy change when two distinct gases are mixed happens 
to be independent of the nature of the gases; is this in conflict with the idea that 
in the limit as the two gases become identical the entropy change should vanish? 
(2) Should the thermodynamic entropy of Clausius be extensive or not? Is this 
a mere convention, or a hard experimental fact? (3) Should two microstates 
that differ only in the exchange of identical particles be counted as two or just
one microstate? 

The conventional wisdom, as recorded in many textbooks, asserts that the
resolution of the paradox rests on quantum mechanics. This
analysis is unsatisfactory on many counts; at best it is incomplete. While it is
true that the exchange of identical quantum particles does not lead to a new
microstate, this approach ignores the case of classical, and even non-identical,
particles. Nanoparticles in a colloidal suspension or macromolecules in solution
provide us with simple examples: should the entropy of such systems of non- 
identical classical particles be extensive or not? 

A number of authors (including Grad, van Kampen, and Jaynes, among others)
have recognized that quantum theory has no bearing on the matter. Grad
and van Kampen approach the problem from the points of view of both
phenomenological classical thermodynamics and statistical mechanics. Theirs
is the orthodox version of a statistical mechanics that ultimately rests upon
ergodic theory. Jaynes addresses the subject as a problem in phenomenological
thermodynamics. Perhaps surprisingly, Jaynes does not provide a
statistical analysis from the point of view of information theory; it is only in the
last few paragraphs of his paper, and then only as a speculation, that he suggests that
a more fundamental statistical treatment rooted in information theory might
provide a more satisfactory and complete analysis.

Whether one accepts information theory or ergodic theory as the correct 
foundation for statistical mechanics turns out to be immaterial in this case: all 
three of those authors agree that the paradox is resolved once it is realized that 
(1) the experimental data are usually silent on the matter of whether the
thermodynamic entropy is extensive or not (extensivity is no more than a convenient
convention); (2) there is no such thing as the entropy of a system; rather, there
are many entropies; and (3) the appropriate choice of entropy, or equivalently
the choice between treating particles as being distinguishable or not, depends on
the resolution of the particular experiment being performed.

The purpose of this paper is essentially pedagogical; we add to their
analysis by examining the paradox from the point of view of information theory.
Our argument is guided by a certain property of entropy, the 'grouping'
property, that Shannon recognized as an essential requirement on any measure of
information. (Briefly, this property is the statement that the entropy of a prob- 
ability distribution over a set of alternatives is unchanged if the alternatives 
are grouped into subsets, each with its own entropy.) This provides the right 
language, and therefore a useful guide in navigating past various conceptual
difficulties. For example, the usual treatments of thermodynamics might mislead
one to think in terms of only one entropy. On the other hand, the usual
treatments of statistical mechanics aim at providing the link between descriptions
in terms of 'micro' and 'macro' states and, in the process, might mislead one
to presuppose that no other intermediate 'meso' states are relevant. Focusing
on the 'grouping' property steers the discussion in the right direction, namely 
towards studying the connection between the different entropies corresponding 
to different descriptions of the same system. 

This paper is organized as follows: In section 2 we review the 'grouping'
property and in section 3 we specify what we mean by the various states that
enter our discussion. Sections 4 and 5 discuss the cases of identical and of
non-identical particles. In section 6 we offer some final remarks.

2 The 'grouping' property 

Perhaps the easiest way to enunciate the 'grouping' property is to prove it. 

Our choice of notation and language will reflect, from the very start, the 
fact that our subject is statistical mechanics. A physical system can be in any 
one of a set of alternative states which will be labeled i = 1,2,.... Knowing 
that the system is found in state i requires complete information up to the most 
minute details about every particle in the system, their location, momentum, 
internal state, etc. Such detailed states are called microstates. For a classical 
system the microstates should more properly be described as points in a phase 
space continuum. It is convenient, however, to divide phase space into cells of 
arbitrarily small size and consider discrete classical microstates. 

The precise microstate of the system is not known; the limited information
we possess allows us at best to assign a probability $p_i$ to each microstate $i$. The
amount of additional information that would allow us to pinpoint the actual
microstate is given by the entropy of the distribution $p_i$,

$$ S[p] = -\sum_i p_i \log p_i . \tag{1} $$

The set of microstates $i$ (phase space) can be partitioned into non-overlapping
subsets; the subsets are groups of microstates which will be labeled $g$. The sum
in eq.(1) can then be rearranged into

$$ S[p] = -\sum_g \sum_{i \in g} p_i \log p_i . \tag{2} $$

Let the probability that the system is found in group $g$ be

$$ P_g = \sum_{i \in g} p_i . \tag{3} $$

Next consider the sum over groups $g$ in eq.(2). Each term in this sum can be
rearranged as follows

$$ -\sum_{i \in g} p_i \log p_i = -P_g \sum_{i \in g} \frac{p_i}{P_g} \log \frac{p_i}{P_g} - P_g \log P_g . \tag{4} $$

Let $p_{i|g}$ denote the conditional probability that the system is in microstate $i \in g$
given we know it is in the group $g$,

$$ p_{i|g} = \frac{p_i}{P_g} \quad \text{for } i \in g . \tag{5} $$

Then eq.(1) can be written as

$$ S = S_G + \sum_g P_g S_g , \tag{6} $$

where

$$ S_G = -\sum_g P_g \log P_g \tag{7} $$

and

$$ S_g = -\sum_{i \in g} p_{i|g} \log p_{i|g} . \tag{8} $$

Eq.(6) is the 'grouping' property we seek. It can be read as follows: the
information required to locate the system in one of its microstates $i$ equals the
information required to locate the system in one of the groups $g$, plus the
expectation over the groups of the information required to pinpoint the microstate
within each group.
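
The decomposition can be checked numerically. The following is a minimal sketch, not part of the original argument: it draws an arbitrary probability distribution over nine microstates, splits them into three groups chosen purely for illustration, and compares the two sides of the 'grouping' property. The function names and the particular partition are assumptions made for this example.

```python
import math
import random

def entropy(probs):
    """Shannon entropy -sum p log p (natural log), skipping zero entries."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def grouping_decomposition(p, groups):
    """Return (S, S_G, sum_g P_g S_g) for a partition of the microstates.

    p      : list of microstate probabilities p_i
    groups : list of index lists that together partition range(len(p))
    """
    S = entropy(p)
    P = [sum(p[i] for i in g) for g in groups]   # group probabilities P_g
    S_G = entropy(P)                             # entropy over the groups
    S_within = 0.0
    for P_g, g in zip(P, groups):
        if P_g > 0.0:
            cond = [p[i] / P_g for i in g]       # conditional p_{i|g}
            S_within += P_g * entropy(cond)      # expected within-group entropy
    return S, S_G, S_within

# Arbitrary illustrative distribution and partition (assumptions).
random.seed(0)
weights = [random.random() for _ in range(9)]
total = sum(weights)
p = [w / total for w in weights]
groups = [[0, 1, 2], [3, 4], [5, 6, 7, 8]]

S, S_G, S_within = grouping_decomposition(p, groups)
print(S, S_G + S_within)   # the two numbers agree up to rounding
```

Any other partition of the same nine microstates gives different values of $S_G$ and of the within-group term, but their sum is always the same $S$.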

Shannon regarded the 'grouping' property as an obvious requirement to
be imposed on any measure of information and included it among his axioms
leading to eq.(1). To others, who did not seek a measure of the amount of information,
but rather a method to take information into account, eq.(6) is not an obviously
unavoidable requirement. Thus, other derivations of eq.(1) have been proposed
from which entropy emerges as a tool for consistent reasoning. In these
approaches entropy does not need to be interpreted in terms of heat, disorder, 
uncertainty, or lack of information: entropy needs no interpretation. 



3 Entropies and descriptions 

The 'grouping' property, eq.(6), plays an important role because it establishes
a relation between two different descriptions and, in doing so, it invokes three
different entropies (none of which is the thermodynamical entropy of Clausius).

We describe the system with high resolution as being in any one of its
microstates $i$ with probability $p_i$, or alternatively, with lower resolution as being
in any one of the groups $g$ with probability $P_g$. Since the description in terms
of groups is less detailed we might refer to them as 'mesostates'.

A thermodynamical description, on the other hand, corresponds to neither 
the high resolution description in terms of microstates, nor the lower resolution 
description in terms of mesostates. It is a description that incorporates the 
least amount of information necessary for a totally macroscopic characteriza- 
tion of equilibrium. The state that is relevant here is defined by the values of 
those variables the control of which guarantees the macroscopic reproducibility 
of experiments. Such states are called macrostates. The typical thermody- 
namic variables include the energy, volume, magnetization, etc., but here, for 
simplicity, we will consider only the energy since including other macrovariables 
is straightforward and does not modify the gist of the argument. 






The standard connection between the thermodynamic description in terms
of macrostates and the description in terms of microstates is established using
the Method of Maximum Entropy. Let the energy of microstate $i$ be $\varepsilon_i$. To
the macrostate of energy $E$ we associate that probability distribution $p_i$ which
maximizes the entropy (1) subject to the constraints

$$ \sum_i p_i = 1 \quad \text{and} \quad \sum_i p_i \varepsilon_i = E . \tag{9} $$

The well-known result is the canonical distribution,

$$ p_i = \frac{e^{-\beta \varepsilon_i}}{Z_H} , \tag{10} $$

where the partition function $Z_H$ and the Lagrange multiplier $\beta$ are determined
from

$$ Z_H = \sum_i e^{-\beta \varepsilon_i} \quad \text{and} \quad \frac{\partial \log Z_H}{\partial \beta} = -E . \tag{11} $$

The corresponding entropy, obtained by substituting eq.(10) into eq.(1),

$$ S_H = \beta E + \log Z_H , \tag{12} $$

measures the amount of information required, beyond the value $E$, to specify the
microstate.
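
As an illustration of how the Lagrange multiplier is fixed by the energy constraint, here is a small sketch under stated assumptions: a toy system with four microstates whose energies are invented for the example, and a plain bisection search for $\beta$ (the choice of solver is ours, not the paper's). It then checks that the entropy of the resulting canonical distribution equals $\beta E + \log Z_H$.

```python
import math

def canonical(energies, beta):
    """Canonical probabilities exp(-beta*e_i)/Z and log Z, computed stably."""
    e_min = min(energies)
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    z_shifted = sum(weights)
    log_Z = math.log(z_shifted) - beta * e_min
    return [w / z_shifted for w in weights], log_Z

def mean_energy(energies, beta):
    probs, _ = canonical(energies, beta)
    return sum(p * e for p, e in zip(probs, energies))

def solve_beta(energies, E, lo=-50.0, hi=50.0, tol=1e-13):
    """Bisection for beta such that the mean energy equals E.

    Relies on the mean energy being a decreasing function of beta.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_energy(energies, mid) > E:
            lo = mid          # energy too high: beta must increase
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Hypothetical microstate energies and target macrostate energy (assumptions).
energies = [0.0, 1.0, 2.0, 5.0]
E = 1.2

beta = solve_beta(energies, E)
p, log_Z = canonical(energies, beta)
S_H = -sum(pi * math.log(pi) for pi in p)
print(beta, S_H, beta * E + log_Z)   # the last two numbers agree up to rounding
```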

Before we compute and interpret the probability distribution over mesostates 
and its corresponding entropy we must be more specific about which mesostates 
we are talking about. This is what we do next. 

4 Identical particles 

Consider a system of N classical particles that are exactly identical. The parti- 
cles may or may not interact with each other (i.e., the argument is not limited 
to ideal gases). 

The interesting question is whether these identical particles are also 'distin- 
guishable'. By this we mean the following: we look at two particles now and 
we label them. We look at the particles later; somebody might have switched 
them. Can we tell which particle is which? The answer is: it depends. Whether 
we can distinguish identical particles or not depends on whether we were able 
(and willing) to follow their trajectories. 

A slightly different version of the same question concerns an $N$-particle
system in a certain state. Some particles are permuted. Does this give us a different
state? As discussed in the previous section, the answer to this question requires
a careful specification of what we mean by a state.

If by state we mean a microstate, that is, a point in the $N$-particle phase
space, then a permutation does indeed lead to a new microstate. On the other
hand, our concern with permutations suggests that it is useful to introduce the
notion of a mesostate defined as the group of those $N!$ microstates that are
obtained as permutations of each other. With this definition it is clear that a
permutation of the identical particles does not lead to a new mesostate.

Now we can return to discussing the connection between the thermodynamic
macrostate description and the description in terms of mesostates using, as
before, the Method of Maximum Entropy. Since the particles are identical, all
those microstates $i$ within the same mesostate $g$ have the same energy, which
we will denote by $\varepsilon_g$ (i.e., $\varepsilon_i = \varepsilon_g$ for all $i \in g$). To the macrostate of energy $E$
we associate that probability distribution $P_g$ which maximizes the entropy (7)
subject to the constraints

$$ \sum_g P_g = 1 \quad \text{and} \quad \sum_g P_g \varepsilon_g = E . \tag{13} $$

The result is also a canonical distribution,

$$ P_g = \frac{e^{-\beta \varepsilon_g}}{Z_L} , \tag{14} $$

where

$$ Z_L = \sum_g e^{-\beta \varepsilon_g} \quad \text{and} \quad \frac{\partial \log Z_L}{\partial \beta} = -E . \tag{15} $$

The corresponding entropy, obtained by substituting eq.(14) into eq.(7),

$$ S_L = \beta E + \log Z_L , \tag{16} $$

measures the amount of information required, beyond the value $E$, to specify the mesostate.

Notice that two different entropies $S_H$ and $S_L$ have been assigned to the same
macrostate $E$; they measure the different amounts of additional information
required to specify the state of the system to a high resolution (the microstate)
or to a low resolution (the mesostate).

The relation between $Z_H$ and $Z_L$ can be obtained from eqs.(3), (10) and
(14):

$$ Z_L = \frac{Z_H}{N!} . \tag{17} $$

The relation between $S_H$ and $S_L$ is obtained from the 'grouping' property. First
use eq.(5) to get $p_{i|g} = 1/N!$, and then substitute into eq.(6) (with $S = S_H$ and
$S_G = S_L$) to get

$$ S_L = S_H - \log N! . \tag{18} $$
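
Both relations can be verified by brute-force enumeration on a toy model. The sketch below rests on assumptions that are not in the text: phase space is replaced by six cells with invented single-particle energies, the particles do not interact, and no two particles share a cell (mimicking the limit of arbitrarily small cells), so that every mesostate contains exactly $N!$ microstates. A microstate is an ordered assignment of labelled particles to cells; a mesostate is the unordered set of occupied cells.

```python
import itertools
import math

def entropy(probs):
    """Shannon entropy -sum p log p (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Hypothetical cell energies, particle number and multiplier (assumptions).
cell_energy = [0.0, 0.3, 0.7, 1.1, 1.6, 2.0]
N = 3
beta = 1.3

cells = range(len(cell_energy))
# Microstates: ordered tuples, particle k sits in cell microstate[k].
microstates = list(itertools.permutations(cells, N))
# Mesostates: unordered sets of occupied cells (one per N! permutations).
mesostates = list(itertools.combinations(cells, N))

def energy(occupied):
    """Total energy; it depends only on which cells are occupied."""
    return sum(cell_energy[c] for c in occupied)

Z_H = sum(math.exp(-beta * energy(m)) for m in microstates)
Z_L = sum(math.exp(-beta * energy(g)) for g in mesostates)

p = [math.exp(-beta * energy(m)) / Z_H for m in microstates]   # microstate probabilities
P = [math.exp(-beta * energy(g)) / Z_L for g in mesostates]    # mesostate probabilities

S_H = entropy(p)
S_L = entropy(P)
print(Z_L, Z_H / math.factorial(N))
print(S_L, S_H - math.log(math.factorial(N)))
```

Each printed pair agrees up to rounding: the factor $N!$ simply counts the microstates inside each mesostate.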



Equations (17) and (18) both exhibit the Gibbs $N!$ 'corrections'. Our
analysis shows (1) that the justification of the $N!$ factor is not to be found in quantum
mechanics, and (2) that the $N!$ does not correct anything. The $N!$ is not a fudge
factor that fixes a wrong (possibly nonextensive) entropy $S_H$ into a correct
(possibly extensive) entropy $S_L$. Both entropies $S_H$ and $S_L$ are correct. They differ
because they measure different things: one measures the information to specify
the microstate, the other measures the information to specify the mesostate.

An important goal of statistical mechanics is to provide a justification, an
explanation of thermodynamics. Thus, we still need to ask which of the two
statistical entropies, $S_H$ or $S_L$, should be identified with the thermodynamical
entropy of Clausius, $S_{\rm exp}$. Inspection of eqs.(17) and (18) shows that, as long
as one is not concerned with experiments that involve changes in the number
of particles, the same thermodynamics will follow whether we set $S_H = S_{\rm exp}$ or
$S_L = S_{\rm exp}$. This is the conclusion reached by Grad, van Kampen and Jaynes.

But, of course, experiments involving changes in $N$ are very important (for
example, in the equilibrium between different phases, or in chemical reactions).
Since in the usual thermodynamical experiments we only care that some number
of particles has been exchanged, and we do not care which were the actual
particles exchanged, we expect that the correct identification is $S_L = S_{\rm exp}$.
Indeed, the quantity that regulates the equilibrium under exchanges of particles
is the chemical potential defined by

$$ \mu = -T \left( \frac{\partial S_{\rm exp}}{\partial N} \right)_{E, V, \ldots} . \tag{19} $$

(This is analogous to the temperature, an intensive quantity that regulates the
equilibrium under exchanges of heat.) The two identifications $S_H = S_{\rm exp}$ or
$S_L = S_{\rm exp}$ lead to two different chemical potentials, related by

$$ \mu_L = \mu_H + kT \log N , \tag{20} $$

which follows from eq.(18) on restoring Boltzmann's constant $k$ and using
Stirling's approximation, $\partial \log N! / \partial N \approx \log N$.
It is easy to verify that, under the usual circumstances where surface effects can
be neglected relative to the bulk, $\mu_L$ has the correct functional dependence on
$N$: it is intensive and can be identified with $\mu_{\rm exp}$. On the other hand, $\mu_H$ is not
an intensive quantity and cannot therefore be identified with $\mu_{\rm exp}$.
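
The different $N$ dependence of the two chemical potentials can be seen in the simplest case. The sketch below is an illustration under assumptions that go beyond the text: a classical ideal gas with $Z_H = V^N$ (units chosen so that $kT = 1$ and the thermal wavelength is 1), and the standard equivalent definition $\mu = (\partial F / \partial N)_{T,V}$ with $F = -kT \log Z$. The derivative of $\log N!$ is taken by a central finite difference.

```python
import math

def mu_H(N, density):
    """Chemical potential from F_H = -log Z_H = -N log V, at fixed T and V.

    Units with kT = 1 and thermal wavelength 1 (illustrative assumptions).
    """
    V = N / density
    return -math.log(V)

def mu_L(N, density):
    """Chemical potential from F_L = F_H + log N!, i.e. Z_L = Z_H / N!."""
    # Central difference of log N! = lgamma(N + 1) with step 1 in N.
    d_log_factorial = 0.5 * (math.lgamma(N + 2.0) - math.lgamma(N))
    return mu_H(N, density) + d_log_factorial

density = 0.5   # particles per unit volume, held fixed while N grows
for N in (1.0e3, 1.0e4, 1.0e5, 1.0e6):
    print(int(N), mu_H(N, density), mu_L(N, density))
```

At fixed density the second column keeps drifting as $-\log N$, while the third settles at $\log(\text{density})$: only the chemical potential built from $Z_L$ is independent of the size of the system, and the two differ by approximately $\log N$.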



5 Non-identical particles

In the last section we saw that classical identical particles can be treated, de- 
pending on the resolution of the experiment, as being distinguishable or indis- 
tinguishable. In this section we go further and point out that even non-identical 
particles can be treated as indistinguishable. Our goal is to state explicitly in 
precisely what sense it is up to the observer to decide whether particles are 
distinguishable or not. 

We defined a mesostate as a subset of $N!$ microstates that are obtained as
permutations of each other. With this definition it is clear that a permutation 
of particles does not lead to a new mesostate even if the exchanged particles 
are not identical. This is an important extension because, unlike quantum par- 
ticles, classical particles cannot be expected to be exactly identical down to 
every minute detail. In fact in many cases they are grossly different. Consider 
the example of milk, i.e., a colloidal suspension of fat droplets in water, or a 
solution of macromolecules. A high resolution device, for example an electron 
microscope, would reveal that no two fat droplets or two macromolecules are 
exactly alike. And yet, for the purpose of modelling most of our macroscopic
observations (i.e., the thermodynamics of milk) it is not necessary to take account
of the myriad ways in which two fat droplets can differ.

Consider a system of N particles. We can perform rather crude macroscopic 
experiments the results of which can be summarized with a simple phenomeno- 
logical thermodynamics where N is one of the relevant variables that define the 
macrostate. Our goal is to construct a statistical foundation that will explain 
this macroscopic model, reduce it, so to speak, to 'first principles'. The particles
might ultimately be non-identical, but the crude phenomenology is not
sensitive to their differences and can be explained by postulating mesostates $g$ and
microstates $i$ with well defined energies $\varepsilon_i = \varepsilon_g$, for all $i \in g$, as if the particles
were identical. As in the previous section this statistical model gives

$$ Z_L = \frac{Z_H}{N!} \quad \text{with} \quad Z_H = \sum_i e^{-\beta \varepsilon_i} , \tag{21} $$

and the connection to the thermodynamics is established by postulating

$$ S_{\rm exp} = S_L = S_H - \log N! . \tag{22} $$

Next we consider what happens when more sophisticated experiments are 
performed. The examples traditionally offered in discussions of this sort refer 
to the new experiments made possible by the discovery of membranes that are 
permeable to some of the N particles but not to the others. Other, perhaps 
historically more realistic examples, are afforded by the availability of new ex- 
perimental data, for example, more precise measurements of a heat capacity as 
a function of temperature, or perhaps measurements in a range of temperatures 
that had previously been inaccessible. 

Suppose the new phenomenology can be modelled by postulating the exis- 
tence of two kinds of particles. (Experiments that are even more sophisticated 
might allow us to detect three or more kinds, perhaps even a continuum of 
different particles.) What we previously thought were $N$ identical particles we
will now think of as being $N_a$ particles of type $a$ and $N_b$ particles of type $b$. The
new description is in terms of macrostates defined by $N_a$ and $N_b$ as the relevant
variables.

To construct a statistical explanation of the new phenomenology from 'first 
principles' we need to revise our notion of mesostate. Each new mesostate will 
be a group of microstates which will include all those microstates obtained by 
permuting the a particles among themselves, and by permuting the b particles 
among themselves, but will not include those microstates obtained by permuting 
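$a$ particles with $b$ particles. The new mesostates, which we will label $\hat{g}$ and to
which we will assign energy $\varepsilon_{\hat{g}}$, will be composed of $N_a! N_b!$ microstates $i$, each
with a well defined energy $\varepsilon_i = \varepsilon_{\hat{g}}$, for all $i \in \hat{g}$. The new statistical model gives

$$ \hat{Z}_L = \frac{\hat{Z}_H}{N_a! N_b!} \quad \text{with} \quad \hat{Z}_H = \sum_i e^{-\beta \varepsilon_i} , \tag{23} $$

and the connection to the new phenomenology is established by postulating

$$ S_{\rm new\,exp} = \hat{S}_L = \hat{S}_H - \log (N_a! N_b!) . \tag{24} $$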

In discussions of this topic it is not unusual to find comments to the effect 
that in the limit as particles a and b become identical one expects that the 
entropy of the system with two kinds of particles tends to the entropy of a 
system with just one kind of particle. The fact that this expectation is not met 
is one manifestation of the Gibbs paradox. 

From the information theory point of view the paradox does not arise because 
there is no such thing as the entropy of the system, there are several entropies. 
It is true that as $a \to b$ we will have $\hat{Z}_H \to Z_H$, and accordingly $\hat{S}_H \to S_H$,
but there is no reason to expect a similar relation between $\hat{S}_L$ and $S_L$ because
these two entropies refer to mesostates $\hat{g}$ and $g$ that remain different even as
$a$ and $b$ become identical. In this limit the mesostates $\hat{g}$, which are useful for
descriptions that treat particles $a$ and $b$ as indistinguishable among themselves
but distinguishable from each other, lose their usefulness.
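
The size of the gap that survives the limit can be made explicit. Subtracting the one-kind relation $S_L = S_H - \log N!$ from its two-kind counterpart, and letting the high-resolution entropies coincide, leaves a difference of $\log [N! / (N_a! N_b!)]$ between the two mesostate entropies. The sketch below (the function names are ours, chosen for illustration) evaluates this quantity and compares it with the familiar ideal entropy of mixing, $-N [x \log x + (1-x) \log (1-x)]$ with $x = N_a / N$, in units of Boltzmann's constant.

```python
import math

def log_factorial(n):
    """log n! via the log-gamma function."""
    return math.lgamma(n + 1.0)

def mesostate_entropy_gap(N_a, N_b):
    """log[N!/(N_a! N_b!)]: the log of the number of two-kind mesostates
    contained in each one-kind mesostate."""
    return log_factorial(N_a + N_b) - log_factorial(N_a) - log_factorial(N_b)

def ideal_mixing_entropy(N_a, N_b):
    """Ideal entropy of mixing, -N [x log x + (1 - x) log(1 - x)]."""
    N = N_a + N_b
    x = N_a / N
    return -N * (x * math.log(x) + (1.0 - x) * math.log(1.0 - x))

for N_a, N_b in [(10, 10), (1000, 1000), (500000, 500000), (300000, 700000)]:
    gap = mesostate_entropy_gap(N_a, N_b)
    mix = ideal_mixing_entropy(N_a, N_b)
    print(N_a, N_b, gap, mix, gap / mix)
```

The ratio in the last column approaches 1 as the particle numbers grow. The gap is purely combinatorial: it counts how many of the finer mesostates are merged into each coarser one, and it does not depend on how similar particles $a$ and $b$ are.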

6 Conclusion 

We conclude with a comment and a quotation. First, our comment. 

The Gibbs paradox in its various forms arises from the widespread miscon- 
ception that entropy is a real physical quantity and that one is justified in talking 
about the entropy of the system. The thermodynamic entropy is not a property 
of the system. It is somewhat more accurate to assert that entropy is a property 
of our description of the system, it is a property of the macrostate. More explic- 
itly, it is a function of the macroscopic variables used to define the macrostate. 
To different macrostates reflecting different choices of variables there correspond 
different entropies for the very same system. 

But this is not the complete story: the entropy is not just a function of 
the macrostate. Entropies reflect a relation between two descriptions of the 
same system: in addition to the macrostate, we must also specify the set of 
microstates, or the set of mesostates, as the case might be. Then, having speci- 
fied the macrostate, an entropy can be interpreted as the amount of additional 
information required to specify the microstate or mesostate. We have found 
the 'grouping' property very valuable precisely because it emphasizes this de- 
pendence of entropy on a second argument, namely, the choice of micro or 
mesostates. 

The promised quotation is a remark by van Kampen that very aptly
captures the issue of whether identical classical particles are distinguishable or
not:

"The question is not whether the particles are identical in the eyes
of God, but merely in the eyes of the beholder."

This is a surprisingly perceptive remark, particularly coming from someone who 
strongly opposed the information theory approach to statistical mechanics. 